Abstract

This article is a short review of the concept of information. We
show the strong relation between Information Theory and Physics,
beginning with the concept of the bit and its representation in classical
physical systems, and then moving to the concept of the quantum bit (the
so-called "qubit") and exposing some differences and similarities. This
paper is intended to be read by non-specialists and undergraduate
students of Computer Science, Mathematics and Physics with some knowledge
of Linear Algebra and Quantum Mechanics.

Keywords: Information Theory, Quantum Information, Quantum 
Computation, Computer Science. 



1 Introduction 

Physics is an important subject in the study of information processing. It
could not be otherwise, since information is always represented by a physical
system. When we write, the information is encoded in ink particles on
a paper surface. When we think or memorize something, our neurons are
storing and processing information. Morse code uses a physical system, such
as light or sound waves, to encode and transfer messages. As Rolf Landauer
said, "information is physical". At least for the purposes of our study, this
statement is very adequate.

Every day, we use classical systems to store or read information. This has
been part of human life since the very beginning of history. But what happens
if we use quantum systems instead of classical ones? This is an interesting
subject at the intersection of Physics, Computer Science and Mathematics.

In this article, we show how information is represented, both in quantum
and classical systems. The plan of our work is as follows: in Section 2 we
argue about the physical character of information. In Section 3 we show the
classical point of view of information, i.e., according to Newtonian
Mechanics. In Section 4 the point of view of Quantum Mechanics will be shown.

We also suggest some introductory references that explain most of the 
concepts discussed here ^ |3] . 

The main goal of this paper is to review some mathematical and phys- 
ical aspects of classical information and compare them with its quantum 
counterpart. 

2 Information is physical 

In its very beginning, Computer Science could be considered exclusively a
branch of Mathematics. However, for a few decades now some scientists have
been giving special attention to the connection between Computer Science
and Physics.

One of the first physical aspects that we can raise in classical compu- 
tation is thermodynamics. How much energy is spent when the computer 
does a certain calculation, and how much heat is dissipated? Is it possible to 
create a computer that does not spend any energy at all? To answer these 
questions we will begin by examining Landauer's principle. 

According to Landauer's principle, when a computer erases a bit, the
amount of energy dissipated is at least $k_B T \ln 2$, where $k_B$ is
Boltzmann's constant and $T$ is the temperature of the environment. The
entropy of the environment increases by at least $k_B \ln 2$. This means that
any irreversible operation performed by a computer dissipates heat and spends
energy. For instance, the logical AND operation¹ is irreversible, because
given an output we cannot necessarily know the inputs. If the output is 0,
the inputs could be 00, 01 or 10. This operation erases information from the
input, so it dissipates energy, according to Landauer's principle.
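
To give a feeling for the orders of magnitude involved, the short Python sketch below evaluates the Landauer bound $k_B T \ln 2$ and lists the inputs compatible with each output of the AND gate. The temperature of 300 K and the function names are illustrative assumptions, not part of the original discussion.

```python
import math

# Boltzmann's constant in J/K (exact value in the 2019 SI definition).
K_B = 1.380649e-23


def landauer_limit(temperature_kelvin):
    """Minimum energy (in joules) dissipated when one bit is erased."""
    return K_B * temperature_kelvin * math.log(2)


def preimages_of_and(output):
    """All input pairs (a, b) for which a AND b equals the given output."""
    return [(a, b) for a in (0, 1) for b in (0, 1) if (a & b) == output]


if __name__ == "__main__":
    # Room temperature (300 K) is an assumption made only for illustration.
    print(f"Landauer limit at 300 K: {landauer_limit(300.0):.3e} J per bit")
    # Output 0 has three possible inputs, so the inputs cannot be recovered.
    print("Inputs giving AND = 0:", preimages_of_and(0))
    print("Inputs giving AND = 1:", preimages_of_and(1))
```

The three preimages of the output 0 are exactly the lost information that Landauer's principle associates with dissipated heat.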

If one could create a computer using only reversible operations, this
computer would not, in principle, need to dissipate any energy. That would be
a great achievement, given that our modern society spends more and more on
energy, and computers are responsible for a great part of the problem.
Charles Bennett, in 1973, proved that building a reversible computer is
possible [5]. The next step would be finding universal reversible gates,
i.e., a gate or a small set of gates that allows the construction of circuits
to calculate any computable function. E. Fredkin and T. Toffoli proved the
existence of such a gate in

1 If the reader is not familiar with the concept of logical gate, we recommend the reading 

of m. 
1982 [6]. The Toffoli gate can reproduce the traditional NAND operation
(which is universal in classical computation) and works as follows:

\[ \mathrm{Toffoli}(A, B, C) = (A, B, C \oplus (A \wedge B)), \tag{1} \]

where $\oplus$ is addition modulo 2 and $\wedge$ is the logical AND.
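
The behaviour of the gate in eq. (1) can be checked with a few lines of Python. This is only a sketch: the function names are ours, and the NAND construction assumes the usual trick of fixing the third input to 1.

```python
from itertools import product


def toffoli(a, b, c):
    """Toffoli gate: flips the target c when both controls a and b are 1."""
    return a, b, c ^ (a & b)


def nand_via_toffoli(a, b):
    """NAND obtained by fixing the target bit to 1 and reading the third output."""
    return toffoli(a, b, 1)[2]


if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        # Applying the gate twice restores the input: the gate is its own inverse.
        assert toffoli(*toffoli(*bits)) == bits
    for a, b in product((0, 1), repeat=2):
        print(a, b, "->", nand_via_toffoli(a, b))
```

Because the gate is its own inverse, no input is ever lost, in contrast with the AND gate discussed earlier.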

In principle we could build a reversible computer by simply replacing
NAND gates with Toffoli gates. That is not so simple to implement, though.
Besides, one could question whether such a computer is actually
non-dissipative, since we generate a lot of junk bits that will need to be
erased at some point. Bennett solved this problem by observing that we could
perform the entire computation, print the answer (which is a reversible
operation in Classical Mechanics) and then run the computer backwards,
returning to the initial state. So, we do not need to erase the extra bits.

Another interesting subject that leads us to the intersection between
Computer Science and Physics is Maxwell's demon. In 1871, J.C.
Maxwell proposed a theoretical machine, operated by a little "demon", that
could apparently violate the second law of thermodynamics.

The machine would have two partitions, separated by a little door
controlled by this demon. The modus operandi of this demon would be quite
interesting. It would watch the movement of each molecule, opening the
door whenever one approaches, allowing fast ones to pass from the left to the
right partition and slow ones to pass from the right to the left partition.
By doing that, heat would flow from a cold place to a hot one at no cost. The
solution to this apparent paradox resides in the fact that the demon must
store information about the movement of the particles. Since the demon's
memory is finite, it will have to erase information at some point,
dissipating energy and thus increasing the entropy of the system.

The topics pointed out in this section show how close Computer Science 
and Physics are. In the next sections we will show how information is 
represented by Classical Mechanics, and what happens if we use Quantum 
Mechanics instead. 

3 On classical bits 

A classical computer performs logical and arithmetical operations with a
certain (finite) alphabet². Each one of the symbols that compose this
alphabet must be represented by a specific state of a classical system.

Since we are used to performing calculations with decimal numbers, it is
very natural to think that the computer's alphabet should be composed of
ten different symbols. However, it would be very expensive and complex
to build a computer with this characteristic. Instead, computers work with
2-state systems, the so-called bits, and represent binary numbers.

2 The Turing machine was proposed by Alan Turing in 1936 and became very important
for the understanding of what computers can do. It is composed of a program, a finite
state control, a tape and a read/write tape head.

The concept of bit was anticipated by Leo Szilard |Hj while analyzing the 
Maxwell's demon paradox. However, the word bit (binary digit) was first 
introduced by Tukey. The bit is the fundamental concept in Information
Theory, and is the smallest unit of information that can be handled by a
classical computer. All information stored in the computer is either a bit
or a sequence of bits.

If we join $n$ bits, we can represent $2^n$ different characters. But how many
bits are necessary to represent all the characters in the English alphabet,
plus the numbers and some special characters? If we use 8 bits, we can
represent 256 characters, which is enough! To these 8 bits we give the name
byte³. Another interesting unit is the nibble, which is formed by 4 bits. With
one nibble we can represent all the hexadecimal digits ($2^4 = 16$). Since
the hexadecimal base is widely used in assembly languages and low-level
computing, some computer scientists work with nibbles quite often.

The byte is a very small unit, so we normally use some of its multiples.
The kilobyte (KB) corresponds to 1024 bytes, i.e., 8192 bits. One could
think that 1 KB should be 1000 bytes, but as we are dealing with binary
numbers, the power of 2 closest to 1000 is actually $2^{10} = 1024$.

There are also some other useful units: the megabyte (MB), which
corresponds to 1024 KB; the gigabyte (GB), equal to 1024 MB; the terabyte
(TB), equivalent to 1024 GB; and the petabyte (PB), which corresponds to
1024 TB.
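
The counting arguments above are easy to reproduce. The following Python sketch, with names chosen by us for illustration, computes how many bits are needed for a given number of symbols, splits a byte into its two nibbles, and tabulates the binary multiples of the byte.

```python
# Multiples of the byte in the binary convention used in the text (1 KB = 1024 bytes).
BYTES_PER_UNIT = {"KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40, "PB": 2**50}


def bits_needed(symbol_count):
    """Smallest n such that 2**n >= symbol_count."""
    if symbol_count < 1:
        raise ValueError("symbol_count must be positive")
    return (symbol_count - 1).bit_length()


def byte_to_hex(value):
    """Write a byte as two hexadecimal digits, one per nibble."""
    if not 0 <= value <= 255:
        raise ValueError("a byte holds values from 0 to 255")
    high_nibble, low_nibble = value >> 4, value & 0xF
    return f"{high_nibble:X}{low_nibble:X}"


if __name__ == "__main__":
    # 26 letters plus 10 digits: an illustrative symbol count.
    print("bits for 36 symbols:", bits_needed(36))
    print("bits in one kilobyte:", 8 * BYTES_PER_UNIT["KB"])
    print("byte 175 in hexadecimal:", byte_to_hex(175))
```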

At this point, the idea of Shannon entropy should be introduced [TD| . 
Shannon entropy is an important concept of Information Theory, which
quantifies the uncertainty about a physical system. We can also look at
Shannon entropy from a different point of view, as a function that measures
the amount of information we obtain, on average, when we learn the state
of a physical system.

We define Shannon entropy as a function of a probability distribution,
$p_1, p_2, \ldots, p_n$:

\[ H(p_1, p_2, \ldots, p_n) = -\sum_{x} p_x \log p_x, \tag{2} \]

where $0 \log 0 \equiv 0$, in the context of distributions or generalized
functions. Note that $\lim_{x \to 0} (x \log x) = 0$.

This function will be explained in this paper through an exercise, which 
can also be found in This is an intuitive justification for the function 
we defined above. 

3 Some authors say that the group of 8 bits is special because of the 80x86 processor.
This processor used 8 bits to give memory addresses, i.e., it had 256 different addresses
in memory.

Suppose we want to measure the amount of information associated with an
event $E$, which occurs in a probabilistic experiment. We will use a function
$I(E)$, which fits the following requirements:

1. $I(E)$ is a function only of the probability of the event $E$, so we may
write $I = I(p)$, where $p$ is the probability of the event $E$;

2. $I$ is a smooth function of probability;

3. $I(pq) = I(p) + I(q)$ when $p, q > 0$, i.e., the information obtained when
two independent events occur with probabilities $p$ and $q$ is the sum of
the information obtained from each event alone.

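We want to show that $I(p) = k \log p$, for some constant $k$. From the third
condition of the problem,

\[ I(pq) = I(p) + I(q), \tag{3} \]

we can let $q = 1$, verifying that $I(1) = 0$. Now, we can differentiate both
sides of the above equation with respect to $p$:

\[ \frac{dI(pq)}{dp} = \frac{dI(p)}{dp} + \frac{dI(q)}{dp} \tag{4} \]

\[ \frac{dI(pq)}{d(pq)} \, \frac{d(pq)}{dp} = I'(p) \tag{5} \]

\[ I'(pq) \cdot q = I'(p). \tag{6} \]

When $p = 1$ we can easily note that

\[ I'(q) \cdot q = I'(1). \tag{7} \]

Based on the second condition of the problem, we know that $I'(p)$ is well
defined when $p = 1$, so $I'(1) = k$, with $k$ constant. Therefore,

\[ I'(q) = \frac{k}{q} \tag{8} \]

\[ I(q) = \int \frac{k}{q} \, dq \tag{9} \]

\[ I(q) = k \log q, \tag{10} \]

where the integration constant vanishes because $I(1) = 0$.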

The function I(p) appeared naturally and satisfies the three conditions 
specified by the problem. However, the function I(p) represents the amount 
of information gained by one event with probability p. We are interested in 
a function that gives us the mean information, that is, the entropy. 



\[ H = \langle I \rangle = \frac{\sum_x p_x (k \log p_x)}{\sum_x p_x} \tag{11} \]

\[ H = \langle I \rangle = k \sum_x p_x \log p_x, \tag{12} \]

where $k = -1$, and we have the Shannon entropy formula:

\[ H = -\sum_x p_x \log p_x. \tag{13} \]

If we apply eq. (13) specifically to the case where we have a binary random
variable (which is very common in Computer Science), this entropy receives
the name binary entropy:

\[ H_{bin}(p) = -p \log p - (1 - p) \log (1 - p), \tag{14} \]

where $p$ is the probability for the variable to have the value $v$, and
$(1 - p)$ is the probability for the variable to assume the value $\neg v$.
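
As a concrete illustration of the Shannon entropy and of its binary special case, the Python sketch below evaluates both functions. We assume base-2 logarithms, so that the result is expressed in bits; the function names are ours.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits: H = -sum_x p_x log2 p_x, with 0 log 0 = 0."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 are skipped, which implements the convention 0 log 0 = 0.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def binary_entropy(p):
    """Entropy of a binary variable that takes one value with probability p."""
    return shannon_entropy([p, 1.0 - p])


if __name__ == "__main__":
    print("fair coin:", binary_entropy(0.5))
    print("biased coin (p = 0.1):", binary_entropy(0.1))
    print("four equally likely symbols:", shannon_entropy([0.25] * 4))
```

A fair coin carries the most uncertainty among binary variables, while a certain outcome ($p = 0$ or $p = 1$) carries none.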

Information Theory studies the amount of information contained in a 
certain message, and its transmission through a channel. Shannon's In- 
formation Theory was responsible for giving a precise and mathematical 
definition for information. 

Written languages can be analyzed with the help of Information
Theory ^2]. For a given language, we can define the rate of the language as

\[ r = \frac{H(M)}{N}, \tag{15} \]

where $H(M)$ is the Shannon entropy of a particular message and $N$ is the
length of this message.

In English texts, $r$ normally varies from 1.0 to 1.5 bits per letter. Cover
found $r = 1.3$ bits/letter in [13]. Assuming that, in a certain language
composed of $L$ characters, the probability of occurrence of each letter is
equal, one can easily find the amount of information contained in each
character,

\[ R = \log L, \tag{16} \]

where $R$, the maximum entropy of each character in a language, is called
the absolute rate.

The English alphabet is composed of 26 letters, so its absolute rate is
$\log_2 26 \approx 4.7$ bits/letter. The absolute rate is normally higher than
the rate of the language. Hence, we can define the redundancy of a language as

\[ D = R - r. \tag{17} \]

In the English language, if we consider $r = 1.3$ according to [13], and if we
apply eq. (16) to find $R$, we find that the redundancy is 3.4 bits/letter.
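
The arithmetic behind these figures can be reproduced with the following Python sketch. The function names are ours, and base-2 logarithms are assumed so that all quantities are in bits per letter.

```python
import math


def language_rate(message_entropy_bits, message_length):
    """Rate r = H(M) / N, in bits per letter."""
    return message_entropy_bits / message_length


def absolute_rate(alphabet_size):
    """Absolute rate R = log2(L) for an alphabet of L equally likely letters."""
    return math.log2(alphabet_size)


def redundancy(alphabet_size, rate):
    """Redundancy D = R - r, in bits per letter."""
    return absolute_rate(alphabet_size) - rate


if __name__ == "__main__":
    print(f"absolute rate of a 26-letter alphabet: {absolute_rate(26):.2f} bits/letter")
    print(f"redundancy for r = 1.3: {redundancy(26, 1.3):.2f} bits/letter")
```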

We cannot forget that we deal with information every day. At this exact
moment, you are dealing with the information contained in this paper. So,
it is natural to ask how much information our senses can deal with. Studies
have shown that vision can receive $2.8 \times 10^8$ bits per second, while
hearing can deal with $3 \times 10^4$ bits per second. Our memory can store
and organize information for a long time. The storage capacity of the human
brain varies from $10^{11}$ to $10^{12}$ bits. Just as a comparison, we can
mention that the knowledge of a foreign language requires about
$4 \times 10^6$ bits [Tl].

4 Introducing qubits 

The quantum bit is often called a "qubit". The key idea is that a quantum
system will be used to store and handle data. When we use a classical system,
such as a capacitor or a transistor, the properties of Classical Mechanics are
still observed. On the other hand, if we use a quantum system to process
information, we can take advantage of its quantum-mechanical properties.

Quantum Mechanics has a probabilistic character. While a classical
system can be in one, and only one, state, a quantum system can be in a
state of superposition, as if it were in different states simultaneously, each
one associated with a probability. Mathematically, we will express this
$N$-state quantum system as the linear combination

\[ |\psi\rangle = \sum_{i=0}^{N-1} a_i |i\rangle, \tag{18} \]

where the $a_i$ are complex numbers called amplitudes. We know, from the
postulates of Quantum Mechanics, that $|a_k|^2$ is the probability of
obtaining $|k\rangle$ when measuring the state $|\psi\rangle$. Then,

\[ \sum_{i=0}^{N-1} |a_i|^2 = 1. \tag{19} \]
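
The relation between amplitudes and measurement probabilities can be made concrete with a small Python sketch. It stores a state as a plain list of complex amplitudes, which is a representation chosen by us for illustration; the example amplitudes are arbitrary.

```python
import math


def normalize(amplitudes):
    """Rescale the amplitudes so that the squared moduli sum to 1."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if norm == 0:
        raise ValueError("the zero vector is not a valid state")
    return [a / norm for a in amplitudes]


def measurement_probabilities(amplitudes):
    """Probability of obtaining each basis state |k> when measuring the state."""
    return [abs(a) ** 2 for a in amplitudes]


if __name__ == "__main__":
    # An arbitrary 4-state example with complex amplitudes.
    state = normalize([1, 1j, -1, 0])
    probabilities = measurement_probabilities(state)
    print("probabilities:", probabilities)
    print("total:", sum(probabilities))
```

Note that the amplitudes $1$, $i$ and $-1$ give the same probability, which already hints that amplitudes carry more information than probabilities do.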

In Quantum Computation we normally work with 2-state systems
(otherwise we would not be referring to qubits, but qutrits, qu-nits or
something similar). So, a quantum bit can assume any value of the form

\[ |\psi\rangle = \alpha |0\rangle + \beta |1\rangle, \tag{20} \]

with $\alpha$, $\beta$ complex numbers and $|\alpha|^2 + |\beta|^2 = 1$. It is
important to stress that the amplitudes are not simple probabilities. The
state $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ is different from
$\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$, for instance. In this case we say
that the two states differ by a relative phase. However, the states
$|\psi\rangle$ and $e^{i\theta}|\psi\rangle$ (where $\theta$ is a real number)
are considered equal, because they differ only by a global phase. The global
phase factor has no influence on the measurement of the state.
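
The difference between a relative and a global phase can be seen numerically. The Python sketch below is an illustration under our own assumptions: it uses the Hadamard gate, which is not discussed in the text, as a convenient change of measurement basis.

```python
import cmath
import math

SQRT_HALF = 1 / math.sqrt(2)
PLUS = [SQRT_HALF, SQRT_HALF]    # (|0> + |1>) / sqrt(2)
MINUS = [SQRT_HALF, -SQRT_HALF]  # (|0> - |1>) / sqrt(2)


def probabilities(state):
    """Probabilities of measuring 0 and 1 in the computational basis."""
    return [abs(a) ** 2 for a in state]


def hadamard(state):
    """Hadamard gate (an assumption of this sketch, used to change basis)."""
    a, b = state
    return [(a + b) * SQRT_HALF, (a - b) * SQRT_HALF]


def apply_global_phase(state, theta):
    """Multiply the whole state by exp(i * theta)."""
    phase = cmath.exp(1j * theta)
    return [phase * a for a in state]


def same_distribution(p, q, tol=1e-12):
    """Compare two probability lists up to rounding errors."""
    return all(abs(x - y) <= tol for x, y in zip(p, q))


if __name__ == "__main__":
    # Measured directly, the two states look identical.
    print(probabilities(PLUS), probabilities(MINUS))
    # After a Hadamard gate the relative phase becomes observable.
    print(probabilities(hadamard(PLUS)), probabilities(hadamard(MINUS)))
    # A global phase changes nothing that can be measured.
    print(probabilities(hadamard(apply_global_phase(PLUS, 0.7))))
```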

Superposition is quite interesting because, while a classical bit can
assume only one value, its quantum counterpart can assume a superposition
of states. A single qubit can hold both 0 and 1 simultaneously. Similarly,
an $n$-qubit register can hold, simultaneously, all the values from 0 to
$2^n - 1$. Consequently, we can do the same calculation on different values
at the same time, simply by performing an operation on a quantum system.

Figure 1: Bloch sphere.

Now, returning to the mathematical study of the quantum system, we
can observe that a single qubit can be represented in a two-dimensional
complex vector space. Of course, that does not help us much in terms of
geometric visualization. However, note that we may rewrite eq. (20) as

\[ |\psi\rangle = e^{i\gamma} \left( \cos\frac{\theta}{2} |0\rangle + e^{i\varphi} \sin\frac{\theta}{2} |1\rangle \right), \tag{21} \]

where $\theta$, $\varphi$ and $\gamma$ are real numbers. The global factor
$e^{i\gamma}$ can be ignored, since it has no observable effects:

\[ |\psi\rangle = \cos\frac{\theta}{2} |0\rangle + e^{i\varphi} \sin\frac{\theta}{2} |1\rangle. \tag{22} \]

Now, we can represent a qubit in a three-dimensional real vector space.
According to eq. (19), the qubit norm must be equal to 1, so the numbers
$\theta$ and $\varphi$ will define a point on a sphere: the so-called Bloch
sphere.
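
A minimal Python sketch of this parametrization is given below. It assumes the usual convention in which the angles $\theta$ and $\varphi$ are the polar and azimuthal angles of a unit vector; the function names are ours.

```python
import cmath
import math


def qubit_from_angles(theta, phi):
    """Amplitudes (alpha, beta) of cos(theta/2)|0> + exp(i phi) sin(theta/2)|1>."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]


def bloch_vector(theta, phi):
    """Cartesian coordinates of the corresponding point on the unit sphere."""
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )


if __name__ == "__main__":
    # theta = 0 is the north pole and corresponds to the state |0>.
    print(qubit_from_angles(0.0, 0.0), bloch_vector(0.0, 0.0))
    # theta = pi is the south pole and corresponds to the state |1>.
    print(qubit_from_angles(math.pi, 0.0), bloch_vector(math.pi, 0.0))
```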

As we can see, there are infinitely many points on the Bloch sphere.
Nevertheless, it is important to emphasize that all we can learn from a
measurement is 0 or 1, but not the values of $\theta$ or $\varphi$. Moreover,
after performing a measurement the state will be irreversibly collapsed
(projected) to either $|0\rangle$ or $|1\rangle$. Were it otherwise, we could
write an entire encyclopedia in one qubit, by taking advantage of the
infinitely many possible values of $\theta$ and $\varphi$.

If we wished to represent a composite physical system (which could be
a quantum register, for instance), we would use an operation called the tensor
product, represented by the symbol $\otimes$. The state of a quantum register
$|\phi\rangle$ composed of the qubits $|\psi_i\rangle$, where $i$ varies from
1 to $n$, is

\[ |\phi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle. \tag{23} \]
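
For state vectors written as lists of amplitudes, the tensor product reduces to multiplying every amplitude of one factor by every amplitude of the other. The Python sketch below illustrates this; the list representation and the function names are our own choices.

```python
from functools import reduce


def tensor(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists."""
    return [a * b for a in u for b in v]


def register(*qubits):
    """State of a register built from individual qubits, in the given order."""
    return reduce(tensor, qubits)


if __name__ == "__main__":
    zero, one = [1, 0], [0, 1]
    # |1> (x) |0> (x) |1> is the basis state |101>, i.e. entry number 5.
    print(register(one, zero, one))
```

A register of $n$ qubits is described by $2^n$ amplitudes, which is why the list doubles in length with every qubit added.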



We recommend that the reader refers to to get more information on 
this postulate. 

4.1 The no-cloning theorem 

There is a remarkable difference between classical and quantum states, which
is the impossibility of the latter being perfectly cloned when it is not known
a priori. This is stated by the no-cloning theorem, published by W.K.
Wootters and W.H. Zurek in 1982 ^Hj. Here, we will prove that a generic
quantum state cannot be cloned. We recommend reading
the article cited above for a more complete understanding.
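
Before the proof, a numerical illustration may help. The Python sketch below tests one candidate "copier", the controlled-NOT gate acting on the unknown qubit and a blank qubit prepared in $|0\rangle$. Both the choice of gate and the choice of blank state are assumptions made for this example; the sketch only shows that this particular candidate copies the basis states but fails on a superposition.

```python
import math


def tensor(u, v):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [a * b for a in u for b in v]


def cnot(state):
    """CNOT on two qubits (first qubit is the control).

    Amplitudes are ordered as |00>, |01>, |10>, |11>.
    """
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


def is_close(u, v, tol=1e-12):
    """Compare two state vectors up to rounding errors."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))


ZERO = [1.0, 0.0]
ONE = [0.0, 1.0]
PLUS = [1 / math.sqrt(2), 1 / math.sqrt(2)]


def copies_correctly(psi):
    """True when CNOT maps |psi>|0> to |psi>|psi>."""
    return is_close(cnot(tensor(psi, ZERO)), tensor(psi, psi))


if __name__ == "__main__":
    print("copies |0>:", copies_correctly(ZERO))
    print("copies |1>:", copies_correctly(ONE))
    print("copies (|0> + |1>)/sqrt(2):", copies_correctly(PLUS))
```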

Let us suppose we wish to create a machine that receives two qubits
as inputs, called qubit A and qubit B. Qubit A will receive an unknown
quantum state, $|\psi\rangle$, and qubit B a pure standard state, $|s\rangle$
(such as a blank sheet of paper in a copy machine). We wish to copy the state
$|\psi\rangle$ to qubit B. The initial state of the machine is

\[ |\psi\rangle \otimes |s\rangle. \tag{24} \]

If the copy were possible, there would be a unitary operator $U$ such that
$U(|\psi\rangle \otimes |s\rangle) = |\psi\rangle \otimes |\psi\rangle$.
However, we wish our machine to be able to copy different states. So, the
operator $U$ must also be such that
$U(|\phi\rangle \otimes |s\rangle) = |\phi\rangle \otimes |\phi\rangle$.
Since $U$ is unitary and $\langle s|s\rangle = 1$, taking the inner product
between these two equations gives

\[ \langle\psi|\phi\rangle = (\langle\psi|\phi\rangle)^2. \tag{25} \]

It is easy to realize that the solutions of this equation are
$\langle\psi|\phi\rangle = 1$ and $\langle\psi|\phi\rangle = 0$, i.e., when
$|\phi\rangle = |\psi\rangle$ or when $|\phi\rangle \perp |\psi\rangle$. The
first solution is useless, so we proved that a perfect cloning machine is only
able to clone orthogonal states.

The no-cloning theorem leads us to a very interesting application of
Quantum Mechanics: a provably secure protocol for key distribution that
can be used together with Vernam's cipher to provide absolutely reliable
cryptography. The reader can refer to [2] for a short introduction to this
subject.

4.2 Von Neumann entropy

Up to this point, we have been using the language of state vectors to express
Quantum Mechanics. From now on, it will be useful to introduce another
formalism: the density operator (also called "density matrix"). This is
absolutely equivalent to the language of state vectors, but it will make the
calculations much easier in this case. Besides, the density operator is an
excellent way to express quantum systems whose state is not completely known.
If we have a quantum system with probability $p_i$ of being in the state
$|\psi_i\rangle$, then we call $\{p_i, |\psi_i\rangle\}$ an ensemble of pure
states. We define the density matrix for this system as

\[ \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|. \tag{26} \]

Von Neumann entropy is very similar to the Shannon entropy. It
measures the uncertainty associated with a quantum state. The quantum state
$\rho$ has its Von Neumann entropy given by the formula

\[ S(\rho) = -\mathrm{tr}(\rho \log \rho). \tag{27} \]

Let $\lambda_x$ be the eigenvalues of $\rho$. It is not very difficult to
realize that the Von Neumann entropy can be rewritten as

\[ S(\rho) = -\sum_x \lambda_x \log \lambda_x. \tag{28} \]
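
For a single qubit the eigenvalues of the density matrix can be obtained in closed form from its trace and determinant, so the entropy can be computed without any numerical library. The Python sketch below is restricted to this $2 \times 2$ case and uses base-2 logarithms; both restrictions, as well as the function names, are our assumptions.

```python
import math


def density_matrix(ensemble):
    """rho = sum_i p_i |psi_i><psi_i| for a list of (p_i, [alpha_i, beta_i]) pairs."""
    rho = [[0j, 0j], [0j, 0j]]
    for probability, state in ensemble:
        for row in range(2):
            for col in range(2):
                rho[row][col] += probability * state[row] * complex(state[col]).conjugate()
    return rho


def qubit_eigenvalues(rho):
    """Eigenvalues of a 2x2 Hermitian matrix from its trace and determinant."""
    trace = (rho[0][0] + rho[1][1]).real
    determinant = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    # max(..., 0.0) guards against tiny negative values caused by rounding.
    root = math.sqrt(max(trace ** 2 - 4 * determinant, 0.0))
    return [(trace - root) / 2, (trace + root) / 2]


def von_neumann_entropy(rho):
    """S(rho) = -sum_x lambda_x log2 lambda_x, skipping vanishing eigenvalues."""
    return -sum(l * math.log2(l) for l in qubit_eigenvalues(rho) if l > 1e-12)


if __name__ == "__main__":
    plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]
    pure = density_matrix([(1.0, plus)])
    mixed = density_matrix([(0.5, [1, 0]), (0.5, [0, 1])])
    print("pure state:", von_neumann_entropy(pure))
    print("equal mixture of |0> and |1>:", von_neumann_entropy(mixed))
```

A pure state has zero entropy, while the equal mixture of $|0\rangle$ and $|1\rangle$ reaches the maximum of one bit for a single qubit.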

Another important concept is the relative entropy. We can define the
relative entropy of $\rho$ to $\sigma$ as

\[ S(\rho \| \sigma) = \mathrm{tr}(\rho \log \rho) - \mathrm{tr}(\rho \log \sigma), \tag{29} \]

where $\rho$ and $\sigma$ are density operators.

According to Klein's inequality, the quantum relative entropy is never
negative:

\[ S(\rho \| \sigma) \geq 0, \tag{30} \]

with equality holding if and only if $\rho = \sigma$. The proof of this
theorem is not relevant here, but it can be found in [2, page 511].

4.3 Further comments on Quantum Information Theory 

Quantum Information Theory is concerned with the exchange of information
between two or more parties when a quantum-mechanical channel is
used to achieve this objective. Naturally, the purpose of this paper is not to
give a deep understanding of this subject. Quantum Information Theory,
as well as its classical counterpart, is a vast area of knowledge, which would
require much more than just a few pages to be fully explained. Instead, we
give some basic elements, allowing the reader, regardless of his or her area
of knowledge, to have a better understanding of Quantum Computation and
Quantum Information Processing.
Quantum systems have a collection of astonishing properties. Some of
them could, at least in principle, be used in Computer Science, allowing the
production of new technology. One of these amazing properties we have
already mentioned: superposition. If in the future mankind learns
how to control a large number of qubits in a state of superposition for long
enough, we will have the computer of our dreams. It would be a great step for
science.

Another important property is the entanglement |1 1 1 llfij. Some states 
are so strongly connected that one cannot be described while disregarding the
other. In other words, they cannot be written separately, as a tensor product.
This property brings interesting consequences. Imagine that Alice prepares the
state below⁴ in her laboratory, in Brazil:

\[ |\psi\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle_a |0\rangle_b + |1\rangle_a |1\rangle_b \right). \tag{31} \]

After that, Alice keeps qubit $a$ and gives qubit $b$ to Bob, who will take it
to another laboratory, let us say, in Australia. Now, we know from the third
postulate of Quantum Mechanics that if either of them measures the state, it
will collapse either to $|0\rangle_a |0\rangle_b$ or to
$|1\rangle_a |1\rangle_b$. So, the state of the qubit in
Australia can be modified by a measurement done in Brazil and vice versa!
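
The correlations described above can be illustrated with a short Python sketch. It assumes the Bell state $(|00\rangle + |11\rangle)/\sqrt{2}$ and uses the standard criterion that a two-qubit pure state with amplitudes $a_{00}, a_{01}, a_{10}, a_{11}$ is a tensor product exactly when $a_{00} a_{11} = a_{01} a_{10}$. The names and the random seed are our own choices.

```python
import math
import random

LABELS = ("00", "01", "10", "11")
# Amplitudes of |00>, |01>, |10>, |11> for the assumed Bell state.
BELL_STATE = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]


def outcome_probabilities(state):
    """Probability of each joint measurement result."""
    return {label: abs(a) ** 2 for label, a in zip(LABELS, state)}


def is_product_state(state, tol=1e-12):
    """True when the two-qubit state can be written as a tensor product."""
    a00, a01, a10, a11 = state
    return abs(a00 * a11 - a01 * a10) <= tol


def measure(state, rng):
    """Sample one joint measurement result of both qubits."""
    probabilities = outcome_probabilities(state)
    return rng.choices(LABELS, weights=[probabilities[l] for l in LABELS])[0]


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed, only to make the run repeatable
    print("probabilities:", outcome_probabilities(BELL_STATE))
    print("product state?", is_product_state(BELL_STATE))
    # Alice's and Bob's results always agree: only 00 and 11 ever appear.
    print("samples:", [measure(BELL_STATE, rng) for _ in range(10)])
```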

Reference is strongly recommended as a starting point, for those 
who want to study this topic more deeply. 

5 Concluding remarks 

In this paper, we have shown some of the main aspects of information.
Information Theory normally considers that all information must have a
physical representation. But Nature is much more than the classical world
that we see every day. If we remember that the amazing quantum world can
also represent information, we discover astonishing properties, leading us to
a new field of study. Here, we briefly introduced this subject to students
and researchers from different areas of knowledge.

In Computer Science, we normally wish to represent some information, 
manipulate it in order to perform some calculation and, finally, measure it, 
obtaining the result. We began by showing how information is represented, 
in classical systems and in quantum systems. In a forthcoming work [3], we 
show how information can be manipulated in each case. 

Classical and quantum information have similarities and differences,
which were briefly exposed in this article. The technological differences are
still enormous. While the technology to produce classical computers is
highly developed, the experiments involving quantum computers are not so
simple and progress slowly. However, as we saw in this article, the
properties of quantum information are so interesting that the development
of quantum computers in the future can become one of the greatest
achievements of our history.

4 This is one of the so-called Bell states.